In this comment, we investigate a commonly used algorithm proposed by Newman
[M. E. J. Newman, Phys. Rev. E 64, 016132 (2001)] to calculate the betweenness centrality of all
vertices. We point out the inaccuracy of Newman's algorithm and give a corrected algorithm, also
with O(MN) time complexity. In addition, we compare the results of the two algorithms on the
protein interaction network of yeast.
Betweenness centrality, also called load or simply betweenness, is a very useful measure in
network analysis. The concept was first proposed by Anthonisse [1] and Freeman [2] and was
introduced to the physics community by Newman [3]. The betweenness of a node v
is defined as



$$B(v) = \sum_{s \neq t,\; s \neq v} \frac{\sigma_{st}(v)}{\sigma_{st}} \tag{1}$$



where $\sigma_{st}(v)$ is the number of shortest paths from s to t that pass through v, and
$\sigma_{st}$ is the total number of shortest paths from s to t. The end points of each path are
counted as part of the path [3].

FIG. 1: The two examples used to illustrate the difference between Newman's algorithm and the
corrected algorithm. (a) A copy of the network from Ref. [2], which has also been used as an
illustration of Newman's algorithm. (b) The minimal network that exhibits the difference. The
hollow circles represent vertices and the solid lines represent edges. Each vertex is labelled
with a natural number inside its circle, and the number beside each vertex v is $\sigma_{0v}$.

TABLE I: Calculation results of figure 1(a)

Vertices 1 2 3 4 5
Newman's 9 34§ 28± 22i 29± 21 f

TABLE II: Calculation results of figure 1(b)

Newman proposed a very fast algorithm, taking only O(MN) time, to calculate the betweenness of
all vertices [3], where M and N denote the numbers of edges and vertices, respectively. The
algorithm proceeds as follows.

(1) Calculate the distance from a vertex s to every 
other vertex by using breadth-first search. 

(2) A variable $b^s_v$, taking the initial value 1, is assigned
to each vertex v.

(3) Going through the vertices v in order of their distance from s, starting from the farthest,
the value of $b^s_v$ is added to the corresponding variable on the predecessor vertex of v. If v
has more than one predecessor, then $b^s_v$ is divided equally between them.

(4) Go through all vertices in this fashion and record the value $b^s_v$ for each v. Repeating
the entire calculation for every vertex s, the betweenness of each vertex v is
obtained as



$$B(v) = \sum_{s \neq v} b^s_v \tag{2}$$
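The four steps above can be sketched in Python as follows. This is only an illustrative sketch, not Newman's own code: the function names `newman_scores_from` and `newman_betweenness`, the adjacency-dictionary input format for an undirected network, and the use of exact fractions are all assumptions made for the example. The final sum skips the case s = v, as in the definition (1).

```python
from collections import deque
from fractions import Fraction


def newman_scores_from(adj, s):
    """Return the equal-division scores b_v^s for one source s.

    adj is assumed to map each vertex to a list of its neighbours
    (undirected network, hypothetical input format).
    """
    # Step (1): breadth-first search for distances from s.
    dist = {s: 0}
    order = [s]                      # vertices in order of increasing distance
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                order.append(w)
                queue.append(w)

    # Step (2): every reachable vertex starts with the value 1.
    b = {v: Fraction(1) for v in order}

    # Step (3): farthest vertices first, split b[v] equally among predecessors.
    for v in reversed(order):
        preds = [u for u in adj[v] if dist[u] == dist[v] - 1]
        for u in preds:
            b[u] += b[v] / len(preds)
    return b


def newman_betweenness(adj):
    """Step (4): sum the scores over all sources s different from v."""
    B = {v: Fraction(0) for v in adj}
    for s in adj:
        for v, value in newman_scores_from(adj, s).items():
            if v != s:
                B[v] += value
    return B
```

For the three-vertex path 0-1-2, written as `{0: [1], 1: [0, 2], 2: [1]}`, this sketch gives B(0) = B(2) = 2 and B(1) = 4, which agrees with the definition because a tree has no vertex with more than one predecessor.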



Since the contributions of the predecessors of a vertex v to its betweenness B(v) are not equal,
it is not proper to divide $b^s_v$ equally between them. Clearly, if the vertex v has n
predecessors labelled $u_1, u_2, \ldots, u_n$ and $\sigma_{sv}$ different shortest paths to
vertex s, then we have

$$\sigma_{sv} = \sum_{i=1}^{n} \sigma_{su_i} \tag{3}$$

The different shortest paths from s to v are divided into n sets $G_1, G_2, \ldots, G_n$, where
$G_i$ contains the paths that pass through $u_i$. The number of elements in $G_i$, which is also
the number of different shortest paths from s to $u_i$, expresses the contribution of the
predecessor $u_i$ to v's betweenness. Therefore, the betweenness of vertex v induced by the given
source s should be divided in proportion to $\sigma_{su_i}$ rather than equally between
its predecessors. The corrected algorithm is as follows.

(1) Calculate the distance from a vertex s to every 
other vertex by using breadth-first search, taking time 
O(M). 

(2) Calculate the number of shortest paths from vertex s to every other vertex by using dynamic
programming [4], which also takes time O(M). The process is as follows. (2.1) Assign
$\sigma_{ss} = 1$. (2.2) If all the vertices at distance d (d ≥ 0) have been assigned (note that
the distance from s to s is zero), then for each vertex v at distance d + 1, assign
$\sigma_{sv} = \sum_u \sigma_{su}$, where u runs over all of v's predecessors. (2.3) Repeat step
(2.2) until there are no unassigned vertices left.

(3) A variable $\beta^s_v$, taking the initial value 1, is assigned
to each vertex v.

(4) Going through the vertices v in order of their distance from s, starting from the farthest,
the value of $\beta^s_v$ is added to the corresponding variable on the predecessor vertex of v.
If v has more than one predecessor $u_1, u_2, \ldots, u_n$, then for each $u_i$ the value
$\beta^s_v$ is multiplied by $\sigma_{su_i}/\sigma_{sv}$ and then added to $\beta^s_{u_i}$.

(5) Go through all vertices in this fashion and record the value $\beta^s_v$ for each v.
Repeating the entire calculation for every vertex s, the betweenness of each vertex v is
obtained as



$$B(v) = \sum_{s \neq v} \beta^s_v \tag{4}$$
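A minimal Python sketch of steps (1) to (5) is given below, under stated assumptions: the network is undirected and stored as a dictionary of adjacency lists, and the names `corrected_scores_from` and `corrected_betweenness` are hypothetical. Steps (1) and (2) are merged into a single breadth-first pass, which is one common way to do the path counting; the count starts from $\sigma_{ss} = 1$, since the source reaches itself by exactly one path of length zero.

```python
from collections import deque
from fractions import Fraction


def corrected_scores_from(adj, s):
    """Return the proportional-division scores beta_v^s for one source s."""
    # Steps (1) and (2): distances and shortest-path counts in one BFS.
    dist = {s: 0}
    sigma = {s: 1}                   # sigma[v] = number of shortest s-v paths
    order = [s]                      # vertices in order of increasing distance
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                order.append(w)
                queue.append(w)
            if dist[w] == dist[u] + 1:
                # u is a predecessor of w, so its paths extend to w.
                sigma[w] += sigma[u]

    # Step (3): every reachable vertex starts with the value 1.
    beta = {v: Fraction(1) for v in order}

    # Step (4): farthest vertices first; predecessor u receives the share
    # sigma[u] / sigma[v] of beta[v].
    for v in reversed(order):
        for u in adj[v]:
            if dist[u] == dist[v] - 1:
                beta[u] += beta[v] * Fraction(sigma[u], sigma[v])
    return beta


def corrected_betweenness(adj):
    """Step (5): sum the scores over all sources s different from v."""
    B = {v: Fraction(0) for v in adj}
    for s in adj:
        for v, value in corrected_scores_from(adj, s).items():
            if v != s:
                B[v] += value
    return B
```

Because the shares of all predecessors of v add up to one by Eq. (3), no extra normalisation is needed, and a vertex with a single predecessor simply passes its whole value on.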



Clearly, the time complexity of the corrected algorithm is also O(MN). In addition, one should
note a more general algorithm proposed by Brandes [5], which can be used to calculate all kinds
of centrality based on shortest-path counting for both unweighted
and weighted networks.

These two algorithms, Newman's and the corrected one, give the same result if the network has a
tree structure. However, when loops appear in the network, differences between them can be
observed. Figure 1 shows two examples: the first is copied from Ref. [2], and the second is the
minimal network that exhibits the difference between Newman's algorithm and the corrected one.
The results of the two algorithms are compared in Tables I and II. The two algorithms produce
different results even for networks with very few vertices.
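As a hand-checkable illustration, consider a hypothetical six-vertex network chosen for this note (it is not claimed to be the network of Fig. 1(b)) with edges 0-1, 0-2, 1-3, 2-3, 2-4, 3-5, 4-5 and source s = 0. Vertex 5 has two predecessors, 3 and 4, which are reached from the source by 2 and 1 shortest paths, respectively, so the two division rules give different values:

```latex
\begin{align*}
\text{Path counts:}\quad
  & \sigma_{00}=\sigma_{01}=\sigma_{02}=\sigma_{04}=1,\quad
    \sigma_{03}=2,\quad \sigma_{05}=3 \\
\text{Equal division:}\quad
  & b^0_5 = 1,\quad b^0_3 = b^0_4 = \tfrac{3}{2},\quad
    b^0_1 = \tfrac{7}{4},\quad b^0_2 = \tfrac{13}{4},\quad b^0_0 = 6 \\
\text{Proportional division:}\quad
  & \beta^0_5 = 1,\quad \beta^0_3 = \tfrac{5}{3},\quad \beta^0_4 = \tfrac{4}{3},\quad
    \beta^0_1 = \tfrac{11}{6},\quad \beta^0_2 = \tfrac{19}{6},\quad \beta^0_0 = 6
\end{align*}
```

Counting directly from the definition, vertex 3 lies on the single shortest path from 0 to itself and on 2 of the 3 shortest paths from 0 to 5, which gives 1 + 2/3 = 5/3. This matches the proportional rule and not the value 3/2 obtained by equal division.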

In addition, we compare the performance of these two algorithms on the protein interaction
network of yeast [6]. This network has 2617 vertices, but only its largest component, containing
2375 vertices, is taken into account. Figures 2(a) and 2(b) report the absolute and relative
differences, respectively, between Newman's results and the accurate results (obtained from the
corrected algorithm). The deviation is distinct and cannot be neglected. Fortunately, the
statistical features may be similar. Although the details of the Zipf plots [7] of the top-100
vertices are not the same, both curves follow a power law with almost the same exponent. We have
also checked that the scaling law [8, 9] of the betweenness distribution in Barabási-Albert
networks [10] is preserved, while the power-law exponents change slightly.

FIG. 2: Comparison between Newman's algorithm and the corrected algorithm on the protein
interaction network of yeast. (a) and (b) show the absolute and relative differences between
Newman's results and the accurate results, respectively. (c) is the Zipf plot of the 100
vertices with the highest betweenness.

The measure of betweenness is now widely used to detect community/module structures [11, 12] and
to analyze dynamics on networks. Since the statistical characteristics of the betweenness
distributions obtained by Newman's algorithm and the corrected algorithm are almost the same,
some researchers may have noticed the difference between the two algorithms but paid no
attention to it. However, many previous works have demonstrated that the betweenness of a few
nodes, rather than the overall betweenness distribution, may sometimes determine the key
features of dynamic behavior on networks. Examples are numerous: they include network traffic
[13-15], synchronization [16-18], cascading dynamics [19], and so on. In figure 2(b), one can
see that for many nodes the relative difference between the two algorithms exceeds 10%, and for
a few nodes it reaches nearly 30%. Therefore, the difference cannot be neglected, especially
when analyzing network dynamics.

Although Newman's algorithm does not agree with the definition of betweenness [3], it may be
more practical, especially for large-scale communication systems in which the routers do not
know how many shortest paths there are to the destination. Even if they can store the weights of
all their successors, implementing the biased choices may bring additional economic and
technical costs. Hence simply choosing with equal probability at each branch point may be more
natural, which is in accordance with Newman's algorithm.